Semi-automatic Composition of Data Layout Transformations for Loop Vectorization

نویسندگان

  • Shixiong Xu
  • David Gregg
چکیده

In this paper we put forward an annotation system for specifying a sequence of data layout transformations for loop vectorization. We propose four basic primitives for data layout transformations that programmers can compose to achieve complex data layout transformations. Our system automatically modifies all loops and other code operating on the transformed arrays. In addition, we propose data layout aware loop transformations to reduce the overhead of address computation and help vectorization. Taking the Scalar Penta-diagonal (SP) solver, from the NAS Parallel Benchmarks as a case study, we show that the programmer can achieve significant speedups using our annotations.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Lecture Notes on Loop Transformations for Cache Optimization 15-411: Compiler Design

In this lecture we consider loop transformations that can be used for cache optimization. The transformations can improve cache locality of the loop traversal or enable other optimizations that have been impossible before due to bad data dependencies. Those loop transformations can be used in a very flexible way and are used repeatedly until the loop dependencies are well aligned with the memor...

متن کامل

Joint Scheduling and Layout Optimization to Enable Multi-Level Vectorization

We describe a novel loop nest scheduling strategy implemented in the R-Stream compiler : the first scheduling formulation to jointly optimize a trade-off between parallelism, locality, contiguity of array accesses and data layout permutations in a single complete formulation. Our search space contains the maximal amount of vectorization in the program and automatically finds opportunities for a...

متن کامل

High-Level Loop Optimizations for GCC

This paper will present a design for loop optimizations using high-level loop transformations. We will describe a loop optimization infrastructure based on improved induction variable, scalar evolution, and data dependence analysis. We also will describe loop transformation opportunities that utilize the information discovered. These transformations increase data locality and eliminate data dep...

متن کامل

Lecture Notes on Linear Cache Optimization & Vectorization 15 - 411 : Compiler Design

The big missing questions on cache optimization are how and when generally to transform loops? What is the best choice to find a loop transformation? Is there a big common systematic picture? How to get fast by vectorizing and/or parallelizing loops after the loop transformations have made some loops parallelizable? And, finally, how can we use more fancy transformations for complicated problems.

متن کامل

Notes on Linear Cache Optimization & Vectorization 15 - 411 : Compiler Design André Platzer

The big missing questions on cache optimization are how and when generally to transform loops? What is the best choice to find a loop transformation? Is there a big common systematic picture? How to get fast by vectorizing and/or parallelizing loops after the loop transformations have made some loops parallelizable? And, finally, how can we use more fancy transformations for complicated problems.

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2014